AITopics | behavior space

Collaborating Authors

behavior space

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generating Behaviorally Diverse Policies with Latent Diffusion Models

Neural Information Processing SystemsApr-25-2026, 08:47:20 GMT

Recent progress in Quality Diversity Reinforcement Learning (QD-RL) has enabled learning a collection of behaviorally diverse, high performing policies. However, these methods typically involve storing thousands of policies, which results in high space-complexity and poor scaling to additional behaviors. Condensing the archive into a single model while retaining the performance and coverage of the original collection of policies has proved challenging. In this work, we propose using diffusion models to distill the archive into a single generative model over policy parameters. We show that our method achieves a compression ratio of 13x while recovering 98% of the original rewards and 89% of the original humanoid archive coverage. Further, the conditioning mechanism of diffusion models allows for flexibly selecting and sequencing behaviors, including using language.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Add feedback

Generating Behaviorally Diverse Policies with Latent Diffusion Models

Neural Information Processing SystemsFeb-8-2026, 08:47:45 GMT

Quality Diversity (QD) is an emerging field in which collections of high performing, behaviorally diverse solutions are trained.

archive, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Soft Quality-Diversity Optimization

Hedayatian, Saeed, Nikolaidis, Stefanos

arXiv.org Artificial IntelligenceDec-2-2025

Quality-Diversity (QD) algorithms constitute a branch of optimization that is concerned with discovering a diverse and high-quality set of solutions to an optimization problem. Current QD methods commonly maintain diversity by dividing the behavior space into discrete regions, ensuring that solutions are distributed across different parts of the space. The QD problem is then solved by searching for the best solution in each region. This approach to QD optimization poses challenges in large solution spaces, where storing many solutions is impractical, and in high-dimensional behavior spaces, where discretization becomes ineffective due to the curse of dimensionality. We present an alternative framing of the QD problem, called \emph{Soft QD}, that sidesteps the need for discretizations. We validate this formulation by demonstrating its desirable properties, such as monotonicity, and by relating its limiting behavior to the widely used QD Score metric. Furthermore, we leverage it to derive a novel differentiable QD algorithm, \emph{Soft QD Using Approximated Diversity (SQUAD)}, and demonstrate empirically that it is competitive with current state of the art methods on standard benchmarks while offering better scalability to higher dimensional problems.

behavior space, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.0081

Country:

Europe (1.00)
North America > Canada (0.67)
North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Vector Quantized-Elites: Unsupervised and Problem-Agnostic Quality-Diversity Optimization

Tsakonas, Constantinos, Chatzilygeroudis, Konstantinos

arXiv.org Artificial IntelligenceNov-21-2025

Quality-Diversity algorithms have transformed optimization by prioritizing the discovery of diverse, high-performing solutions over a single optimal result. However, traditional Quality-Diversity methods, such as MAP-Elites, rely heavily on predefined behavior descriptors and complete prior knowledge of the task to define the behavior space grid, limiting their flexibility and applicability. In this work, we introduce Vector Quantized-Elites (VQ-Elites), a novel Quality-Diversity algorithm that autonomously constructs a structured behavior space grid using unsupervised learning, eliminating the need for prior task-specific knowledge. At the core of VQ-Elites is the integration of Vector Quantized Variational Autoencoders, which enables the dynamic learning of behavior descriptors and the generation of a structured, rather than unstructured, behavior space grid -- a significant advancement over existing unsupervised Quality-Diversity approaches. This design establishes VQ-Elites as a flexible, robust, and task-agnostic optimization framework. To further enhance the performance of unsupervised Quality-Diversity algorithms, we introduce behavior space bounding and cooperation mechanisms, which significantly improve convergence and performance, as well as the Effective Diversity Ratio and Coverage Diversity Score, two novel metrics that quantify the actual diversity in the unsupervised setting. We validate VQ-Elites on robotic arm pose-reaching, mobile robot space-covering, and MiniGrid exploration tasks. The results demonstrate its ability to efficiently generate diverse, high-quality solutions, emphasizing its adaptability, scalability, robustness to hyperparameters, and potential to extend Quality-Diversity optimization to complex, previously inaccessible domains.

artificial intelligence, evolutionary algorithm, machine learning, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TEVC.2025.3631786

2504.08057

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.87)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Complex System Exploration with Interactive Human Guidance

Morel, Bastien, Moulin-Frier, Clément, Barla, Pascal

arXiv.org Artificial IntelligenceOct-2-2025

The diversity of patterns that emerge from complex systems motivates their use for scientific or artistic purposes. When exploring these systems, the challenges faced are the size of the parameter space and the strongly non-linear mapping between parameters and emerging patterns. In addition, artists and scientists who explore complex systems do so with an expectation of particular patterns. Taking these expectations into account adds a new set of challenges, which the exploration process must address. We provide design choices and their implementation to address these challenges; enabling the maximization of the diversity of patterns discovered in the user's region of interest -- which we call the constrained diversity -- in a sample-efficient manner. The region of interest is expressed in the form of explicit constraints. These constraints are formulated by the user in a system-agnostic way, and their addition enables interactive system exploration leading to constrained diversity, while maintaining global diversity.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.00794

Country:

Europe (0.46)
North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.47)

Add feedback

Navigating Trade-offs: Policy Summarization for Multi-Objective Reinforcement Learning

Osika, Zuzanna, Zatarain-Salazar, Jazmin, Oliehoek, Frans A., Murukannaiah, Pradeep K.

arXiv.org Artificial IntelligenceNov-7-2024

Multi-objective reinforcement learning (MORL) is used to solve problems involving multiple objectives. An MORL agent must make decisions based on the diverse signals provided by distinct reward functions. Training an MORL agent yields a set of solutions (policies), each presenting distinct trade-offs among the objectives (expected returns). MORL enhances explainability by enabling fine-grained comparisons of policies in the solution set based on their trade-offs as opposed to having a single policy. However, the solution set is typically large and multi-dimensional, where each policy (e.g., a neural network) is represented by its objective values. We propose an approach for clustering the solution set generated by MORL. By considering both policy behavior and objective values, our clustering method can reveal the relationship between policy behaviors and regions in the objective space. This approach can enable decision makers (DMs) to identify overarching trends and insights in the solution set rather than examining each policy individually. We tested our method in four multi-objective environments and found it outperformed traditional k-medoids clustering. Additionally, we include a case study that demonstrates its real-world application.

cluster 1, objective, objective space, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.3233/FAIA240830

2411.04784

Country:

North America > United States > New Jersey > Essex County > Newark (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Asia > Macao (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Quality with Just Enough Diversity in Evolutionary Policy Search

Templier, Paul, Grillotti, Luca, Rachelson, Emmanuel, Wilson, Dennis G., Cully, Antoine

arXiv.org Artificial IntelligenceMay-7-2024

Evolution Strategies (ES) are effective gradient-free optimization methods that can be competitive with gradient-based approaches for policy search. ES only rely on the total episodic scores of solutions in their population, from which they estimate fitness gradients for their update with no access to true gradient information. However this makes them sensitive to deceptive fitness landscapes, and they tend to only explore one way to solve a problem. Quality-Diversity methods such as MAP-Elites introduced additional information with behavior descriptors (BD) to return a population of diverse solutions, which helps exploration but leads to a large part of the evaluation budget not being focused on finding the best performing solution. Here we show that behavior information can also be leveraged to find the best policy by identifying promising search areas which can then be efficiently explored with ES. We introduce the framework of Quality with Just Enough Diversity (JEDi) which learns the relationship between behavior and fitness to focus evaluations on solutions that matter. When trying to reach higher fitness values, JEDi outperforms both QD and ES methods on hard exploration tasks like mazes and on complex control problems with large policies.

archive, fitness, jedi, (17 more...)

arXiv.org Artificial Intelligence

2405.04308

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Diversity from Human Feedback

Wang, Ren-Jian, Xue, Ke, Wang, Yutong, Yang, Peng, Fu, Haobo, Fu, Qiang, Qian, Chao

arXiv.org Artificial IntelligenceDec-10-2023

Diversity plays a significant role in many problems, such as ensemble learning, reinforcement learning, and combinatorial optimization. How to define the diversity measure is a longstanding problem. Many methods rely on expert experience to define a proper behavior space and then obtain the diversity measure, which is, however, challenging in many scenarios. In this paper, we propose the problem of learning a behavior space from human feedback and present a general method called Diversity from Human Feedback (DivHF) to solve it. DivHF learns a behavior descriptor consistent with human preference by querying human feedback. The learned behavior descriptor can be combined with any distance measure to define a diversity measure. We demonstrate the effectiveness of DivHF by integrating it with the Quality-Diversity optimization algorithm MAP-Elites and conducting experiments on the QDax suite. The results show that DivHF learns a behavior space that aligns better with human requirements compared to direct data-driven approaches and leads to more diverse solutions under human preference. Our contributions include formulating the problem, proposing the DivHF method, and demonstrating its effectiveness through experiments.

behavior descriptor, behavior space, human preference, (14 more...)

arXiv.org Artificial Intelligence

2310.06648

Country:

Asia > China > Jiangsu Province > Nanjing (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Maryland > Baltimore (0.04)
(13 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Add feedback

Quality Diversity under Sparse Reward and Sparse Interaction: Application to Grasping in Robotics

Huber, J., Hélénon, F., Coninx, M., Amar, F. Ben, Doncieux, S.

arXiv.org Artificial IntelligenceOct-31-2023

Quality-Diversity (QD) methods are algorithms that aim to generate a set of diverse and high-performing solutions to a given problem. Originally developed for evolutionary robotics, most QD studies are conducted on a limited set of domains - mainly applied to locomotion, where the fitness and the behavior signal are dense. Grasping is a crucial task for manipulation in robotics. Despite the efforts of many research communities, this task is yet to be solved. Grasping cumulates unprecedented challenges in QD literature: it suffers from reward sparsity, behavioral sparsity, and behavior space misalignment. The present work studies how QD can address grasping. Experiments have been conducted on 15 different methods on 10 grasping domains, corresponding to 2 different robot-gripper setups and 5 standard objects. An evaluation framework that distinguishes the evaluation of an algorithm from its internal components has also been proposed for a fair comparison. The obtained results show that MAP-Elites variants that select successful solutions in priority outperform all the compared methods on the studied metrics by a large margin. We also found experimental evidence that sparse interaction can lead to deceptive novelty. To our knowledge, the ability to efficiently produce examples of grasping trajectories demonstrated in this work has no precedent in the literature.

algorithm, archive, behavior space, (14 more...)

arXiv.org Artificial Intelligence

2308.05483

Country: